Technology Capacity Management Techniques
Forecasting And Modelling
The prime objective of capacity management is to predict the behaviour of IT services under a given volume and variety
of work. Technology capacity forecasting is the process of modelling and forecasting the technology capacity required
by an engagement to meet the demands of the IT services. Forecasting reduces the risk of technology capacity-related
performance and availability issues. It addresses the uncertainty of the future by relying mainly on data from the
past and present, together with analysis of trends.
To provide a solid forecast, future business development and its impact on the level of resource utilization of the
relevant service components need to be identified. By comparing these requirements against the current levels of
resource utilization (which can be obtained from an accurate CMDB), it should be possible not only to generate a
forecast but also to propose options for meeting the business requirements. Agreeing which propositions should be
implemented, and setting priorities, should be done with the customer, as these decisions have financial and
environmental consequences.
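As an illustration of this comparison step, the minimal sketch below projects current utilization forward under assumed growth and flags components that would breach a threshold; the component names, growth factors and the 80% limit are hypothetical, not taken from any standard.

```python
# Hypothetical sketch: compare forecast demand against current utilization.
# Component names, growth factors and the 80% threshold are illustrative only.

current_utilization = {    # fraction of capacity, e.g. from an accurate CMDB
    "app-server": 0.55,
    "db-server": 0.70,
    "storage": 0.40,
}
forecast_growth = {        # multiplier from the expected business development
    "app-server": 1.30,    # +30% workload expected
    "db-server": 1.15,
    "storage": 1.60,
}
THRESHOLD = 0.80           # agreed maximum sustainable utilization

for component, util in current_utilization.items():
    projected = util * forecast_growth[component]
    if projected > THRESHOLD:
        print(f"{component}: projected {projected:.0%} exceeds {THRESHOLD:.0%}"
              " - propose additional capacity to the customer")
    else:
        print(f"{component}: projected {projected:.0%} is within capacity")
```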
Modelling answers two questions: what workload can be supported with given resources, and what level of service can be
provided for a given workload. Workload is the amount of resource usage over a certain period; it usually indicates the
throughput of work for a certain group of users or functions in an engagement. The first stage in modelling is to
create a baseline model that accurately reflects the performance currently being achieved. Once this baseline model has
been created, predictive modelling can begin, i.e. asking the 'What if?' questions that reflect failures and planned
changes to the hardware and/or the volume and variety of workloads. If the baseline model is accurate, then the
predicted effects of those potential failures and changes can be trusted.
Capacity modelling techniques range from estimates based on current resource utilization and experience, through
prototypes, to full-scale benchmarks and pilot studies. Each technique has strengths and weaknesses and suits different
scenarios. All types of modelling can achieve similar levels of accuracy, but all depend entirely on the skill of the
person constructing the model and on the information used to create it. The three most popular capacity modelling
techniques are trending, simulation modelling and analytical modelling.
Trending
Trending, also known as trend analysis, is a modelling technique in which historical data about resource utilization
and service performance is used to forecast future behaviour. The historical data is usually analyzed in a spreadsheet,
whose graphical, trending and forecasting facilities show resource utilization over time and how it is likely to change
in the future.
For systems that behave linearly, trending provides a sufficient analysis. In practice, trending is a viable approach
in some cases and not in others. Two key points decide whether it is a good approach: first, there should be no unknown
discontinuities; second, the underlying attributes of the system should be linear.
Trending is most effective when there is a small number of variables with a linear relationship between them. It is a
relatively inexpensive modelling technique, but it only provides estimates of future resource utilization. Trend
analysis is less effective at producing an accurate estimate of response times; for those, either analytical or
simulation modelling should be used.
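As a minimal sketch of such a trend analysis outside a spreadsheet, the example below fits a least-squares straight line to twelve months of CPU utilization and extrapolates it; the monthly figures are invented for illustration.

```python
# Hypothetical sketch: straight-line trend fit over monthly CPU utilization.
# The twelve monthly figures below are invented for illustration.
import numpy as np

months = np.arange(1, 13)
cpu_util = np.array([42, 44, 47, 48, 51, 53,
                     55, 58, 60, 61, 64, 66])   # % utilization per month

# Fit a first-degree polynomial (a straight line): util = slope*month + intercept
slope, intercept = np.polyfit(months, cpu_util, 1)

# Extrapolate six months ahead - valid only if growth stays linear and no
# discontinuity (e.g. a new application going live) intervenes.
for future_month in range(13, 19):
    projected = slope * future_month + intercept
    print(f"month {future_month}: projected CPU utilization {projected:.1f}%")
```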
Simulation Modelling
Simulation modelling is a technique in which computer programs emulate the static structure and the various dynamic
aspects of a system. Simulation modelling is typically used before making decisions on load allocation.
Simulation involves the modelling of discrete events (for example, transaction arrival rates) against a given hardware
configuration. For this reason it is also known as discrete event simulation. This type of modelling can be very
accurate in sizing new applications or predicting the effects of changes on existing applications. The downside of
this technique is that building and executing the model often takes a long time, which makes it costly.
When simulating transaction arrival rates, two approaches can be followed to input the data: either staff enter a
series of transactions from prepared scripts, or software inputs the same scripted transactions directly with a random
arrival rate. Either approach takes time and effort to prepare and run. However, it can be cost-justified for large
engagements where cost and the associated performance implications are of major importance.
If the response times predicted by the model are sufficiently close to the response times observed in the actual
production environment, the model can be considered effective. Although a simulation model takes longer to build and
consumes more resources than an analytical model, its results are correspondingly more accurate.
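A minimal sketch of a discrete event simulation follows; it models random transaction arrivals at a single server and measures response times. The arrival rate and mean service time are invented for illustration, and a real simulation package would model many more components.

```python
# Hypothetical sketch: discrete event simulation of transactions arriving at a
# single server. The arrival rate and mean service time are invented.
import random

random.seed(42)
ARRIVAL_RATE = 8.0    # transactions per second, arriving at random
SERVICE_TIME = 0.10   # mean seconds of server time per transaction

clock = 0.0           # simulated wall-clock time
server_free_at = 0.0  # when the server finishes its current transaction
response_times = []

for _ in range(10_000):
    clock += random.expovariate(ARRIVAL_RATE)       # next transaction arrives
    start = max(clock, server_free_at)              # queue if server is busy
    service = random.expovariate(1 / SERVICE_TIME)  # random service demand
    server_free_at = start + service
    response_times.append(server_free_at - clock)   # queueing + service time

print(f"mean response time: {sum(response_times) / len(response_times):.3f}s")
```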
Analytical Modelling
Analytical models are constructed and used by capacity planners to predict computing resource requirements related to
workload behaviour and volume changes. These are mathematical models that have a closed form solution, i.e. the
solution to the equations used to describe changes in a system can be expressed as a mathematical analytic function.
Analytical models are representations of the behaviour of computer systems using mathematical techniques – for example,
multi-class network queuing theory. Analytical models are used to assess current performance and predict future
performance.
Typically, a model is built using software packages by specifying within the package the components and structure of
the configuration that need to be modelled, and the utilization of the components – for example, processor, memory and
disks – by the various workloads or applications. When the model is run, queuing theory is used to calculate the
response times in the computer system. If the response times predicted by the model are sufficiently close to the
response times recorded in real life, the model can be regarded as an accurate representation of the computer
system.
To remain mathematically tractable, analytical models usually include little detail; they therefore tend to be
efficient to run, but less accurate than other modelling techniques. An analytical model does not take much time to
create, but it must be updated frequently.
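As a minimal illustration of the kind of queuing calculation such packages perform, the sketch below uses the classic single-server (M/M/1) queue, where the mean response time is the service time divided by one minus the utilization; the workload figures are invented and chosen to match the simulation sketch above.

```python
# Hypothetical sketch: analytical response time prediction using the classic
# single-server (M/M/1) queuing formula. Workload figures are invented.
SERVICE_TIME = 0.10   # mean seconds of server time per transaction

def mm1_response_time(arrival_rate: float, service_time: float) -> float:
    """Mean response time of a single-server queue with random arrivals."""
    utilization = arrival_rate * service_time
    if utilization >= 1.0:
        raise ValueError("utilization >= 100%: the queue grows without bound")
    return service_time / (1.0 - utilization)

# 'What if?' questions: predict response times as the workload grows.
for rate in (4.0, 6.0, 8.0, 9.5):   # transactions per second
    r = mm1_response_time(rate, SERVICE_TIME)
    print(f"{rate:4.1f} tx/s -> utilization {rate * SERVICE_TIME:.0%}, "
          f"mean response {r:.3f}s")
```

At the shared 8 transactions per second workload, this formula predicts a mean response time of roughly 0.5 seconds, which should agree closely with the simulation sketch; comparing the two is one way the models can be used to validate each other.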
Sizing
System Sizing
Based on the information received from technology capacity management, sizing of the IT infrastructure and
organisation to support the agreed service can be undertaken. Sizing should be undertaken together with specialists to
understand the IT components, with engagement management to understand the KPI aspects, and with service delivery
management to understand the resource aspects.
Any system must be sized to respond adequately during peak demand, which varies over time. To estimate capacity
requirements effectively, engagements must identify the demand on all resources during unplanned peak periods, such as
multiple users accessing a single site at the same time or increased load in the morning. There are two operational
states based on the load of a production system:
Green Zone – the state in which the system is operating under normal load conditions. A system operating in this range
must be able to sustain response times within the acceptable latency and service level targets.
Red Zone – the state in which the load is greater than the normal peak load but the system can still provide service
for a limited period of time. In this state there is a high chance of failures due to bottlenecks. The ultimate goal
must be to design and deploy an environment that can consistently support Red Zone load without service failure and
within acceptable latency and throughput targets.
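A minimal sketch of such a zone classification follows; the normal-peak threshold is hypothetical and would in practice be derived from measured latency and the agreed service level targets.

```python
# Hypothetical sketch: classify a production system's operating state by load.
# The normal-peak threshold is illustrative; in practice it is derived from
# measured latency and the agreed service level targets.
NORMAL_PEAK = 0.70    # fraction of capacity regarded as normal peak load

def operating_zone(load: float) -> str:
    """Return the operational state for a load given as a fraction of capacity."""
    if load <= NORMAL_PEAK:
        return "Green Zone: normal load, response times within targets"
    return "Red Zone: above normal peak - sustainable only for a limited period"

for load in (0.45, 0.82):
    print(f"load {load:.0%}: {operating_zone(load)}")
```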
Application Sizing
Application sizing must be used to estimate the resource requirements to support a proposed change to an existing
service or the implementation of a new service, to ensure that it meets its required service levels. To achieve this,
application sizing must be an integral part of the service lifecycle.
Application sizing has a finite lifespan. It is initiated at the design stage for a new service, or when there is a
major change to an existing service, and is completed when the application is accepted into the live operational
environment. Sizing activities should include all areas of technology related to the applications, infrastructure,
environment and data. Sizing activities are performed using modelling and trending techniques.
During the initial requirements and design, the required service levels must be specified in the service level
requirements. This enables the service design and development to employ the pertinent technologies and products to
achieve a design that meets the desired levels of service. It is much easier and less expensive to achieve the required
service levels if they are considered at the very beginning of the service lifecycle rather than at some later stage.
Other considerations in application sizing are the resilience aspects required for the
design of new services. Technology Capacity Management must provide advice and guidance to the Technology Availability
Management process on the resources required to provide the required level of performance and resilience.
The sizing of the application should be refined as design and development progress. Modelling can be used during
application sizing. The resources to be utilized by the application are likely to be shared with other services, and
potential threats to existing SLA targets must be recognized and managed.
Tuning
The analysis of the monitored data must identify areas of the configuration that could be tuned to make better use of
the service, system and component resources, or to improve the performance of a particular service.
Tuning techniques that are of assistance include:
- Balancing workloads and traffic: Transactions may arrive at the host or server via a particular gateway, depending
on where the transaction was initiated; balancing the ratio of initiation points to gateways evens out the load (see
the sketch after this list).
- Balancing disk traffic: Storing data on disk efficiently and strategically. For example, striping data across many
spindles can reduce data contention.
- Definition of an accepted locking strategy: This specifies when locks are necessary and at which level, for example
database, page, file, record or row. Delaying a lock until an update is necessary can provide worthwhile benefits.
- Efficient use of memory: This includes utilizing memory as appropriate to the circumstances.
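As a minimal sketch of the workload-balancing technique in the first bullet, the example below routes each incoming transaction to the least-loaded gateway; the gateway names and the in-memory counters are hypothetical.

```python
# Hypothetical sketch: balance incoming transactions across gateways by always
# routing to the least-loaded one. Gateway names and counts are illustrative;
# decrementing the count when a transaction completes is omitted for brevity.
active_transactions = {"gateway-a": 0, "gateway-b": 0, "gateway-c": 0}

def route(transaction_id: str) -> str:
    """Send the transaction to the gateway with the fewest active transactions."""
    gateway = min(active_transactions, key=active_transactions.get)
    active_transactions[gateway] += 1
    return gateway

for i in range(6):
    print(f"tx-{i} -> {route(f'tx-{i}')}")
```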
Before implementing any of the recommendations arising from these tuning techniques, it may be appropriate to test the
validity of each recommendation.
Implementation of recommendations: The objective of this activity is to introduce into live operation the changes that
have been identified by the monitoring, analysis and tuning activities. Any changes arising from these activities must
be implemented through a strict, formal change management process. System tuning changes can have major implications
for customer service, and the impact and risk associated with these types of change are likely to be greater than
those of other types of change.